Distribution-free robust linear regression

Abstract

We study random design linear regression with no assumptions on the distribution of the covariates and with a heavy-tailed response variable. In this distribution-free regression setting, we show that boundedness of the conditional second moment of the response given the covariates is a necessary and sufficient condition for achieving nontrivial guarantees. As a starting point, we prove an optimal version of the classical in-expectation bound for the truncated least squares estimator due to Györfi, Kohler, Krzyżak, and Walk. However, we show that this procedure fails with constant probability for some distributions despite its optimal in-expectation performance. Then, combining the ideas of truncated least squares, median-of-means procedures, and aggregation theory, we construct a non-linear estimator achieving excess risk of order $d/n$ with an optimal sub-exponential tail. While existing approaches to linear regression for heavy-tailed distributions focus on proper estimators that return linear functions, we highlight that the improperness of our procedure is necessary for attaining nontrivial guarantees in the distribution-free setting.
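The two building blocks named in the abstract, truncation of a least squares fit and median-of-means, can be illustrated with a minimal Python sketch. This is not the estimator constructed in the paper, which also relies on aggregation and is not reproduced here. The function names, the restriction to a single covariate without intercept, and the truncation level `level` are all assumptions made for illustration only.

```python
import statistics


def truncate(value, level):
    """Clip value to the interval [-level, level]."""
    return max(-level, min(level, value))


def truncated_least_squares_1d(xs, ys, level):
    """Fit y ~ w * x by ordinary least squares (one covariate, no intercept)
    and return a predictor whose output is clipped to [-level, level].

    Hypothetical helper: a one-dimensional illustration of truncation only.
    """
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # If every covariate is zero, any w minimizes the empirical risk; use 0.
    w = sxy / sxx if sxx > 0 else 0.0
    return lambda x: truncate(w * x, level)


def median_of_means(values, n_blocks):
    """Split values into n_blocks consecutive blocks of equal size
    (dropping any remainder), average each block, and return the
    median of the block averages."""
    block_size = len(values) // n_blocks
    if n_blocks <= 0 or block_size == 0:
        raise ValueError("need at least one value per block")
    means = [
        statistics.fmean(values[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]
    return statistics.median(means)


# Usage: the linear fit has slope 2, but predictions never leave [-5, 5].
predict = truncated_least_squares_1d([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], level=5.0)
inside = predict(1.0)    # within the truncation interval
clipped = predict(10.0)  # 2 * 10 = 20 is clipped to the level 5.0

# Usage: one extreme value corrupts only one block mean, not the median.
robust_mean = median_of_means([1.0, 1.0, 1.0, 1.0, 1.0, 1000.0], n_blocks=3)
```

Because the truncated predictor is not a linear function of the covariate, it is an example of an improper (non-linear) procedure in the sense used in the abstract.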

Similar resources

Distribution-Free Distribution Regression

‘Distribution regression’ refers to the situation where a response Y depends on a covariate P where P is a probability distribution. The model is Y = f(P) + μ where f is an unknown regression function and μ is a random error. Typically, we do not observe P directly, but rather, we observe a sample from P. In this paper we develop theory and methods for distribution-free versions of distributi...

Robust Sequential Prediction in Linear Regression with Student's t-distribution

The Predictive Least Squares (PLS) model selection criterion is known to be consistent in the context of linear regression. For small sample sizes, however, it can exhibit erratic behavior. We show that this shortcoming can be amended by incorporating a Student’s t-distribution into PLS. The resulting criterion is shown to be asymptotically equivalent to PLS but significantly more robust for sm...

Robust Estimation in Linear Regression with Multicollinearity and Sparse Models

One of the factors affecting the statistical analysis of the data is the presence of outliers. The methods which are not affected by the outliers are called robust methods. Robust regression methods are robust estimation methods of regression model parameters in the presence of outliers. Besides outliers, the linear dependency of regressor variables, which is called multicollinearity...

Robust High-Dimensional Linear Regression

The effectiveness of supervised learning techniques has made them ubiquitous in research and practice. In high-dimensional settings, supervised learning commonly relies on dimensionality reduction to improve performance and identify the most important factors in predicting outcomes. However, the economic importance of learning has made it a natural target for adversarial manipulation of trainin...

Robust linear least squares regression

Journal

Journal title: Mathematical Statistics and Learning

Year: 2022

ISSN: 2520-2316, 2520-2324

DOI: https://doi.org/10.4171/msl/27